Localizing web videos using social images

Authors

  • Liujuan Cao
  • Xianming Liu
  • Wei Liu
  • Rongrong Ji
  • Thomas S. Huang
Abstract

While inferring the geo-locations of web images has been widely studied, little work has addressed geo-location inference for web videos, because too few labeled samples are available for training. Such geographical localization is nevertheless of great importance in helping existing video-sharing websites provide location-aware services, such as location-based video browsing, video geo-tag recommendation, and location-sensitive video search on mobile devices. In this paper, we address the problem of localizing web videos by transferring large-scale web images with geographic tags to web videos, where near-duplicate detection between images and video frames is used to link visually relevant web images and videos. To carry out our approach, we select trustworthy web images by evaluating the consistency between the visual features and the associated metadata of the collected images, thereby eliminating noisy images. In doing so, we propose a novel transfer learning algorithm that aligns the landmark prototypes across the image and video-frame domains, leading to a reliable prediction of the geo-locations of web videos. Experiments are carried out on two datasets of Flickr images and YouTube videos crawled from the Web. The experimental results demonstrate the effectiveness of our video geo-location inference approach, which outperforms several competing approaches based on traditional frame-level video geo-location inference. © 2014 Elsevier Inc. All rights reserved.
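The idea of linking video frames to geo-tagged images can be illustrated with a minimal sketch. This is not the transfer learning algorithm the abstract proposes; it is closer to the plain frame-level baseline the abstract compares against. Each frame is matched to its most similar geo-tagged image, and the video takes the location with the most votes. The function names, the cosine-similarity measure, the `threshold` value, and the toy feature vectors are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def localize_video(frame_features, tagged_images, threshold=0.9):
    """Vote for a video location from near-duplicate geo-tagged images.

    frame_features: list of feature vectors, one per sampled frame.
    tagged_images:  list of (feature_vector, location_label) pairs.
    threshold:      hypothetical similarity cutoff for a "near-duplicate".
    Returns the most voted location label, or None if no frame matches.
    """
    votes = Counter()
    for frame in frame_features:
        best_sim, best_loc = 0.0, None
        for feat, loc in tagged_images:
            sim = cosine(frame, feat)
            if sim > best_sim:
                best_sim, best_loc = sim, loc
        # Only frames with a sufficiently close image cast a vote.
        if best_loc is not None and best_sim >= threshold:
            votes[best_loc] += 1
    return votes.most_common(1)[0][0] if votes else None


# Toy data: two geo-tagged images and three video frames.
images = [([1.0, 0.0, 0.0], "Paris"), ([0.0, 1.0, 0.0], "Rome")]
frames = [[0.9, 0.1, 0.0], [1.0, 0.05, 0.0], [0.0, 0.0, 1.0]]
print(localize_video(frames, images))
```

In this toy run, the first two frames are near-duplicates of the "Paris" image and the third frame matches nothing, so it casts no vote. A real system would use learned or hand-crafted visual descriptors rather than three-dimensional toy vectors.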

Similar resources

Localizing Web Videos from Heterogeneous Images

While geo-localization of web images has been widely studied, limited effort has been devoted to that of web videos. Nevertheless, an accurate location inference approach specific to web videos is of fundamental importance, as web videos occupy an increasing proportion of the web corpus. The key challenge comes from the lack of sufficient labels for model training. In this paper, we tackle this problem from ...


Localizing and segmenting text in images and videos

Many images—especially those used for page design on web pages—as well as videos contain visible text. If these text occurrences could be detected, segmented, and recognized automatically, they would be a valuable source of high-level semantics for indexing and retrieval. In this paper, we propose a novel method for localizing and segmenting text in complex images and videos. Text lines are ide...


Identifying Unsafe Videos on Online Public Media using Real-time Crowdsourcing

Due to the significant growth of social networking and human activities through the web in recent years, attention to analyzing big data using real-time crowdsourcing has increased. This data may appear in the form of streaming images, audio or videos. In this paper, we address the problem of deciding the appropriateness of streaming videos in public media with the help of crowdsourcing in real...


On Social Network Web Sites: Definition, Features, Architectures and Analysis Tools

Development and usage of online social networking web sites are growing rapidly. Millions of members of these web sites publicly articulate mutual "friendship" relations and share user-created content, such as photos, videos, files, and blogs. The advances in web design technology and the fast-growing usage of online resources prompted web designers to improve features and architectures of social ...


Weakly Supervised Action Recognition and Localization Using Web Images

This paper addresses the problem of joint recognition and localization of actions in videos. We develop a novel Transfer Latent Support Vector Machine (TLSVM) by using Web images and weakly annotated training videos. In order to alleviate the laborious and time-consuming manual annotation of action locations, the model takes training videos which are only annotated with action labels as input. ...


Journal title:
  • Inf. Sci.

Volume 302

Publication date: 2015